
    Hardware Acceleration for Unstructured Big Data and Natural Language Processing.

    The confluence of the rapid growth in electronic data in recent years and the renewed interest in domain-specific hardware accelerators presents exciting technical opportunities. Traditional scale-out solutions for processing vast amounts of text data have been shown to be energy- and cost-inefficient. In contrast, custom hardware accelerators can provide higher throughput, lower latency, and significant energy savings. In this thesis, I present a set of hardware accelerators for unstructured big-data processing and natural language processing. The first accelerator, called HAWK, aims to speed up the processing of ad hoc queries against large in-memory logs. HAWK is motivated by the observation that traditional software-based tools for processing large text corpora use memory bandwidth inefficiently due to software overheads, and thus fall far short of the peak scan rates possible on modern memory systems. HAWK is designed to process data at a constant rate of 32 GB/s, faster than most extant memory systems. I demonstrate that HAWK outperforms state-of-the-art software solutions for text processing, in many cases by almost an order of magnitude. HAWK occupies an area of 45 mm² in its Pareto-optimal configuration and consumes 22 W of power, well within the area and power envelopes of modern CPU chips. The second accelerator I propose aims to speed up similarity measurement calculations for semantic search in the natural language processing space. By leveraging the latency-hiding concepts of multithreading and simple scheduling mechanisms, my design maximizes functional unit utilization. This similarity measurement accelerator provides speedups of 36x-42x over optimized software running on server-class cores, while requiring 56x-58x lower energy and only 1.3% of the area.
    PhD thesis, Computer Science and Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/116712/1/prateekt_1.pd
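
    As a back-of-the-envelope illustration of the bandwidth argument: the 32 GB/s figure comes from the abstract, while the software scan rate below is a hypothetical placeholder, not a measured number.

        # Rough scan-time comparison (illustrative only): the 32 GB/s HAWK
        # rate is from the abstract; the software rate is a hypothetical
        # placeholder standing in for a bandwidth-inefficient scanner.
        CORPUS_BYTES = 100 * 2**30   # a 100 GiB in-memory log
        HAWK_RATE = 32 * 2**30       # 32 GB/s, constant by design
        SW_RATE = 4 * 2**30          # assumed software scan rate

        hawk_s = CORPUS_BYTES / HAWK_RATE
        sw_s = CORPUS_BYTES / SW_RATE
        print(f"HAWK:     {hawk_s:.2f} s")
        print(f"software: {sw_s:.2f} s ({sw_s / hawk_s:.0f}x slower)")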

    Novel mutation predicted to disrupt SGOL1 protein function

    Cell cycle alterations are a major cause of cancer in humans. The proper segregation of sister chromatids during cell division defines the fate of the daughter cells and is maintained by various protein complexes and signaling cascades. Shugoshin (SGOL1) is one of the proteins required for the localization of protein phosphatase 2A (PP2A) to centromeres during division. This localization actively maintains the adherence of sister chromatids at the centromeric region until the checkpoint signals are received. SGOL1 genomic variants have been widely studied for their correlation with chromosomal instability and chromatid segregation errors. Here we used computational methods to prioritize the single nucleotide polymorphisms (SNPs) capable of disrupting the normal function of the SGOL1 protein. L54Q, a mutation predicted as deleterious in this study, is located in the N-terminal coiled-coil domain, which is involved in the proper localization of PP2A to the centromere. We further examined the effect of this mutation on the translational efficiency of the SGOL1 coding gene. Our analysis revealed major structural consequences of the mutation for the folding conformation of the third exon. We then carried out molecular dynamics simulations to unravel the structural variations induced by this mutation in the SGOL1 N-terminal coiled-coil domain. Root mean square deviation (RMSD), root mean square fluctuation (RMSF), and H-bond scores further supported our result. The results obtained in our study provide a landmark for future research in understanding the genotype-phenotype association of damaging non-synonymous SNPs (nsSNPs) in other centromere proteins, as done here for SGOL1, and will help forecast their role in chromosomal instabilities and solid tumor formation.
    Keywords: SGOL1; Molecular Dynamics Simulation; Gromacs; PhD-SNP; SIFT; PolyPhen; MutPred
    The Egyptian Journal of Medical Human Genetics (2013) 14, 149–15
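
    A minimal sketch of the RMSD and RMSF metrics named above, assuming an already-aligned trajectory held in a NumPy array; in practice GROMACS tools such as gmx rms and gmx rmsf compute these from trajectory files.

        import numpy as np

        # traj: (n_frames, n_atoms, 3) aligned coordinates; ref: (n_atoms, 3)
        def rmsd(frame, ref):
            """Root mean square deviation of one frame from a reference."""
            return float(np.sqrt(np.mean(np.sum((frame - ref) ** 2, axis=-1))))

        def rmsf(traj):
            """Per-atom root mean square fluctuation about the trajectory mean."""
            mean_pos = traj.mean(axis=0)                      # (n_atoms, 3)
            sq_dev = np.sum((traj - mean_pos) ** 2, axis=-1)  # (n_frames, n_atoms)
            return np.sqrt(sq_dev.mean(axis=0))               # (n_atoms,)

        # Toy usage with random coordinates standing in for a real trajectory.
        traj = np.random.default_rng(0).normal(size=(100, 50, 3))
        print(rmsd(traj[0], traj.mean(axis=0)), rmsf(traj)[:3])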

    Hardware Acceleration for Similarity Measurement in Natural Language Processing

    The continuation of Moore's law scaling, but in the absence of Dennard scaling, motivates an emphasis on energy-efficient, accelerator-based designs for future applications. In natural language processing, the conventional approach to automatically analyzing vast text collections (scale-out processing) incurs high energy and hardware costs, since the central compute-intensive step of similarity measurement often entails pair-wise, all-to-all comparisons. We propose a custom hardware accelerator for similarity measures that leverages data streaming, memory latency hiding, and parallel computation across variable-length threads. We evaluate our design through a combination of architectural simulation and RTL synthesis. When executing the dominant kernel in a semantic indexing application for documents, we demonstrate throughput gains of up to 42× and 58× lower energy per similarity computation compared to an optimized software implementation, while requiring less than 1.3% of the area of a conventional core.
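
    The pair-wise, all-to-all comparison pattern the accelerator targets, sketched in software. The abstract does not name the similarity measure; cosine similarity is assumed here as a common choice for semantic indexing.

        import numpy as np

        def all_pairs_cosine(docs):
            """docs: (n_docs, dim) feature vectors -> (n_docs, n_docs) similarities."""
            norms = np.linalg.norm(docs, axis=1, keepdims=True)
            unit = docs / np.maximum(norms, 1e-12)  # guard against zero vectors
            return unit @ unit.T                    # O(n^2 * dim) work dominates

        sims = all_pairs_cosine(np.random.rand(1000, 128))
        print(sims.shape)  # (1000, 1000): n^2 comparisons form the costly kernel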

    High-performance advanced encryption standard (AES) security co-processor design

    See PDF. M.S. thesis. Committee Chair: Hsien-Hsin S. Lee

    Sampling and power calculations for randomized evaluations

    Note: Based on slides by Deon Filmer, Jed Friedman, and Esther Duflo/JPal. Event: The Regional Impact Evaluation Workshop
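
    A minimal example of the power calculation such materials cover: the textbook two-arm sample-size formula for a difference in means (a standard formula, not taken from the workshop slides themselves).

        import math
        from statistics import NormalDist

        def n_per_arm(effect, sd, alpha=0.05, power=0.8):
            """Minimum n per arm to detect `effect` with a two-sided test."""
            z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # 1.96 for alpha = 0.05
            z_beta = NormalDist().inv_cdf(power)           # 0.84 for 80% power
            return math.ceil(2 * ((z_alpha + z_beta) * sd / effect) ** 2)

        # Detecting a 0.2-standard-deviation effect takes ~393 subjects per arm:
        print(n_per_arm(effect=0.2, sd=1.0))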

    Bayesian Aggregation of Evidence for Detection and Characterization of Patterns in Multiple Noisy Observations

    Effective use of machine learning to extract maximal information from limited sensor data is one of the important research challenges in robotic sensing. This thesis develops techniques for detecting and characterizing patterns in noisy sensor data. Our Bayesian Aggregation (BA) algorithmic framework can leverage data fusion from multiple low signal-to-noise ratio (SNR) sensor observations to boost the capability to detect and characterize the properties of a signal-generating source or process of interest. We illustrate our research with an application to the nuclear threat detection domain. The developed algorithms are applied to the problem of processing the large amounts of gamma-ray spectroscopy data that can be produced in real time by mobile radiation sensors. The thesis experimentally shows BA's capability to boost sensor performance in detecting radiation sources of interest, even if the source is faint, partially occluded, or enveloped in the noisy and variable radiation background characteristic of urban scenes. In addition, BA provides simultaneous inference of source parameters, such as the source intensity or source type, while detecting it. The thesis demonstrates this capability and also develops techniques to efficiently optimize these parameters over large spaces of possible settings. Methods developed in this thesis are demonstrated both in simulation and in a radiation-sensing backpack that applies robotic localization techniques to enable indoor surveillance of radiation sources.

    The thesis further improves the BA algorithm's robustness under various detection scenarios. First, we augment BA with appropriate statistical models to improve estimation of signal components in low-photon-count detection, where the sensor may receive limited photon counts from the source and/or background. Second, we develop methods for online sensor reliability monitoring to create algorithms that are resilient to possible sensor faults in a data pipeline containing one or multiple sensors. Finally, we develop Retrospective BA, a variant of BA that allows reinterpretation of past sensor data in light of new information about percepts. These retrospective capabilities include the use of Hidden Markov Models in BA to allow automatic correction of a sensor pipeline when sensor malfunction may occur, an Anomaly-Match search strategy to efficiently optimize source hypotheses, and a prototype Multi-Modal Augmented PCA to more flexibly model background and nuisance source fluctuations in a dynamic environment.
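
    A minimal sketch of the evidence-fusion idea behind BA: pooling several low-SNR Poisson count observations into one posterior over a source's emission rate. This is a textbook Gamma-Poisson conjugate update with background ignored for simplicity, not the thesis's full BA algorithm.

        import numpy as np

        def fuse_counts(counts, dwell_s, a0=1.0, b0=1.0):
            """Gamma(a0, b0) prior on rate; Poisson counts over dwell_s seconds each."""
            counts = np.asarray(counts)
            a = a0 + counts.sum()           # shape: prior + total observed counts
            b = b0 + counts.size * dwell_s  # rate: prior + total observation time
            return a, b                     # posterior is Gamma(a, b)

        # Five short, noisy dwells combine into one sharper estimate.
        a, b = fuse_counts([3, 1, 4, 2, 3], dwell_s=1.0)
        print(f"posterior mean rate ≈ {a / b:.2f} counts/s, std ≈ {np.sqrt(a) / b:.2f}")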

    Minimizing Remote Accesses in MapReduce Clusters

    MapReduce, in particular Hadoop, is a popular framework for the distributed processing of large datasets on clusters of relatively inexpensive servers. Although Hadoop clusters are highly scalable and ensure data availability in the face of server failures, their efficiency is poor. We study data placement as a potential source of inefficiency. Despite networking improvements that have narrowed the performance gap between map tasks that access local or remote data, we find that nodes servicing remote HDFS requests see significant slowdowns of collocated map tasks due to interference effects, whereas nodes making these requests do not experience proportionate slowdowns. To reduce remote accesses, and thus avoid their destructive performance interference, we investigate an intelligent data placement policy we call 'partitioned data placement'. We find that, in an unconstrained cluster where a job's map tasks may be scheduled dynamically on any node over time, Hadoop's default random data placement is effective in avoiding remote accesses. However, when task placement is restricted by long-running jobs or other reservations, partitioned data placement substantially reduces remote access rates (e.g., by as much as 86% over random placement for a job allocated only one-third of a cluster).
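
    A toy model of the placement question studied above (not the paper's simulator): when a job may only run on a subset of nodes, how often do its map tasks find a local replica under random versus partitioned placement? The cluster size, replication factor, and partition layout below are illustrative assumptions.

        import random

        NODES, REPLICAS, BLOCKS = 30, 3, 3000
        allowed = set(range(10))  # job restricted to one-third of the cluster

        random.seed(0)
        random_pl = [random.sample(range(NODES), REPLICAS) for _ in range(BLOCKS)]
        # Partitioned placement: one replica of every block inside each partition.
        partitions = [range(0, 10), range(10, 20), range(20, 30)]
        part_pl = [[random.choice(p) for p in partitions] for _ in range(BLOCKS)]

        for name, placement in [("random", random_pl), ("partitioned", part_pl)]:
            remote = sum(not (set(r) & allowed) for r in placement)
            print(f"{name:12s}: {100 * remote / BLOCKS:.1f}% of blocks need remote reads")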

    Distributed On-Line Multi-Agent Optimization Under Uncertainty: Balancing Exploration and Exploitation

    A significant body of work exists on effectively allowing multiple agents to coordinate to achieve a shared goal. In particular, a growing body of work in the Distributed Constraint Optimization (DCOP) framework enables such coordination with different amounts of teamwork. Such algorithms can implicitly or explicitly trade off improved solution quality against increased communication and computation requirements. However, the DCOP framework is limited to planning problems; DCOP agents must have complete and accurate knowledge about the reward function at plan time.

    We extend the DCOP framework, defining the Distributed Coordination of Exploration and Exploitation (DCEE) problem class to address real-world problems, such as ad-hoc wireless network optimization, via multiple novel algorithms. DCEE algorithms differ from DCOP algorithms in that they (1) are limited to a finite number of actions in a single trial, (2) attempt to maximize the on-line, rather than final, reward, (3) are unable to exhaustively explore all possible actions, and (4) may have knowledge about the distribution of rewards in the environment, but not the rewards themselves. Thus, a DCEE problem is not a type of planning problem, as DCEE algorithms must carefully balance and coordinate multiple agents' exploration and exploitation.

    Two classes of algorithms are introduced: static estimation algorithms perform simple calculations that allow agents to either stay or explore, and balanced exploration algorithms use knowledge about the distribution of the rewards and the time remaining in an experiment to decide whether to stay, explore, or (in some algorithms) backtrack to a previous location. These two classes of DCEE algorithms are compared in simulation and on physical robots in a complex mobile ad-hoc wireless network setting. Contrary to our expectations, we found that increasing teamwork in DCEE algorithms may lower team performance. In contrast, agents running DCOP algorithms improve their reward as teamwork increases. We term this previously unknown phenomenon the team uncertainty penalty, analyze it both in simulation and on robots, and present techniques to ameliorate the penalty.

    Keywords: cooperative multi-agent systems, exploration, optimization, Distributed Constraint Optimization (DCOP), Distributed Coordination of Exploration and Exploitation (DCEE)
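
    A sketch of the stay-or-explore decision a single DCEE agent faces, using a simple static estimation rule (illustrative only; not one of the thesis's named algorithms): keep exploring while the expected value of a fresh draw from the known reward distribution beats the current reading.

        import random

        def run_trial(steps=50, seed=1):
            rng = random.Random(seed)
            draw = lambda: rng.uniform(0.0, 1.0)  # known reward distribution
            expected_new = 0.5                    # E[reward] of an unexplored setting
            current, total = draw(), 0.0
            for _ in range(steps):
                if current < expected_new:        # static estimate says explore
                    current = draw()
                total += current                  # on-line reward accumulates
            return total

        print(f"on-line reward over 50 steps: {run_trial():.1f}")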